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Abstract 

In this paper, we propose distributed solvers for systems of linear equations given 
by symmetric diagonally dominant M-matrices based on the parallel solver of 
Spielman and Peng. We propose two versions of the solvers, where in the hrst, full 
communication in the network is required, while in the second communication 
is restricted to the R-Hop neighborhood between nodes for some R > 1. We 
rigorously analyze the convergence and convergence rates of our solvers, showing 
that our methods are capable of outperforming state-of-the-art techniques. 

Having developed such solvers, we then contribute by proposing an accurate dis¬ 
tributed Newton method for network flow optimization. Exploiting the sparsity 
pattern of the dual Hessian, we propose a Newton method for network flow op¬ 
timization that is both faster and more accurate than state-of-the-art techniques. 
Our method utilizes the distributed SDDM solvers for determining the Newton 
direction up to any arbitrary precision e > 0. We analyze the properties of our al¬ 
gorithm and show superlinear convergence within a neighborhood of the optimal. 
Einally, in a set of experiments conducted on randomly generated and barbell net¬ 
works, we demonstrate that our approach is capable of significantly outperforming 
state-of-the-art techniques. 


1 Introduction 

Solving systems of linear equations given by symmetric diagonally matrices (SDD) is of interest 
to researchers in a variety of fields. Such constructs, for example, are used to determine solutions 
to partial differential equations ifTSl and computations of maximum flows in graphs lfT9ll20ll . Other 
application domains include machine learning 1121] [22, and computer vision 1(34^ 


'This research is supported in parts by by ONR grant Number NOOO14-12-1-0997 and AFOSR grant 
FA9550-13-1-0097. 
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Much interest has been devoted to determining fast algorithms for solving SDD systems. Recently, 
Spielman and Teng ll24l . utilized the multi-level framework of ||25]| . pre-conditioners ll26l . and spec¬ 
tral graph sparsifiers lIZTll . to propose a nearly linear-time algorithm for solving SDD systems. Fur¬ 
ther exploiting these ingredients, Koutis et. al 12^1291 developed an even faster algorithm for acquir¬ 
ing e-close solutions to SDD linear systems. Further improvements have been discovered by Kelner 
et. al ll^ . where their algorithm relied on only spanning-trees and eliminated the need for graph 
sparsifiers and the multi-level framework. 

Motivated by applications, much progress has been made in developing parallel versions of these 
algorithms. Koutis and Miller m proposed an algorithm requiring nearly-linear work (i.e., total 
number of operations executed by a computation) and depth (i.e., longest chain of sequential 
dependencies in the computation) for planar graphs. This was then extended to general graphs in ll^ 
leading to depth close to Since then, Peng and Spielman im have proposed an efficient parallel 
solver requiring nearly-linear work and poly-logarithmic depth without the need for low-stretch 
spanning trees. Their algorithm, which we provide a distribute construction for, requires sparse 
approximate inverse chains im which facilitates the solution of the SDD system. 

Less progress, on the other hand, has been made on the distributed version of these solvers. Con¬ 
trary to the parallel setting, memory is not shared and is rather distributed in the sense that each 
unit abides by its own memory restrictions. Furthermore, communication in a distributed setting 
fundamentally relies on message passing through communication links. Current methods, e.g., Ja¬ 
cobi iteration nsiini, can be used for such distributed solutions but require substantial complexity. 
In ll^ . the authors propose a gossiping framework for acquiring a solution to SDDM systems in 
a distributed fashion. Recent work ll^ considers a local and asynchronous solution for solving 
systems of linear equations, where they acquire a bound on the number of needed multiplication 
proportional to the degree and condition number of the graph for one component of the solution 
vector. 


Contributions: In this paper, we propose a fast distributed solver for linear equations given by 
symmetric diagonally dominant M-Matrices. Our approach distributes the parallel solver in CD by 
considering a specific approximated inverse chain which can be computed efficiently in a distributed 
fashion. We develop two versions of the solver. The first, requires full communication in the network, 
while the second is restricted to R-Hop neighborhood of nodes for some i? > 1. Similar to the 
work in CD, our algorithms operate in two phases. In the first, a “crude” solution to the system of 
equations is retuned, while in the second a distributed R-Hop restricted pre-conditioner is proposed 
to drive the “crude” solution to an e-approximate one for any e > 0. Due to the distributed nature 
of the setting considered, the direct application of the sparsfier and pre-conditioner of Peng and 
Spielman CD is difficult due to the need of global information. Consequently, we propose a new 
sparse inverse chain which can be computed in a decentralized fashion for determining the solution 
to the SDDM system. 

Interestingly, due to the involvement of powers of matrices with eigenvalues less than one, our 
inverse chain is substantially shorter compared to that in CD - This leads us to a distributed SDDM 
solver with lower computational complexity compared to state-of-the-art methods. Specifically, our 
algorithm’s complexity is given by 


O 



RWrr, 


log 



with n being the number of nodes in graph Q, W^ax and Wmin denoting the largest and smaller 
weights of the edges in Q, respectively, a = min < n, ( representing the upper bound on the 

size of the R-Hop neighborhood Vv G V, and e G (0, being the precision parameter. Furthermore, 
our approach improves current linear methods by a factor of log n and by a factor of the degree 
compared to ll34l for each component of the solution vector. 

Having developed such distributed solvers, we next contribute by proposing an accurate distributed 
Newton method for network flow optimization. Exploiting the sparsity pattern of the dual Hessian, 
we propose a Newton method for network optimization that is both faster and more accurate than 
state-of-the-art techniques. Our method utilizes the proposed SDDM distributed solvers to approx¬ 
imate the Newton direction up to any arbitrary e > 0. The resulting algorithm is an efficient and 
accurate distributed second-order method which performs almost identically to exact Newton. We 
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analyze the properties of the proposed algorithm and show that, similar to conventional Newton 
methods, superlinear convergence within a neighborhood of the optimal value is attained. We finally 
demonstrate the effectiveness of the approach in a set of experiments on randomly generated and 
Barbell networks. 


2 The parallel SDDM Solver 

We now review the parallel solver for symmetric diagonally dominant (SDD) linear systems HD. 


2.1 Problem Setting 

As detailed in HD, SDDM solvers consider the following system of linear equations; 

MqX = bo ( 1 ) 

where Mq is a Symmetric Diagonally Dominant M-Matrix (SDDM). Namely, Mq is symmetric 
positive definite with non-positive off diagonal elements, such that for alH = 1, 2,..., n: 

n 

[MoU>- 

The system of Equations in [T] can be interpreted as representing an undirected weighted graph, 
Q, with Mq being its Laplacian. Namely, Q = {V,£, W), with V representing the set of nodes, 
£ denoting the edges, and W representing the weighted graph adjacency. Nodes Vi and Vj are 
connected with an edge e = {i,j) iff Wij > 0, where; 

= - [Mo],^ . 

Following HD, we seek e-approximate solutions to x*, being the exact solution of MqX = bo, 
defined as; 


e— Approximate Solution Let x* G M" be the solution of Mox = bo. A vector x G M" is called 
an e— approximate solution, if; 

\\x*-x\\j^^ <e\\x*\\M„, where I= m'tM olt. (2) 

The R-hop neighbourhood of node is defined as (u^) = {n S V ; dist {v^, v) < R}. We also 
make use of the diameter of a graph, Q, defined as diam {Q) = maxi,; dist {vi, Vj). 


Sparsity Pattern We say that a matrix A G has a sparsity pattern corresponding to the R-hop 

neighborhood if Aij = 0 for alH = 1,..., n and for all j such that Vj ^ Nr {vi). 


We will denote the spectral radius of a matrix A by p{A) = max|Ai|, where Ai represents an 
eigenvalue of the matrix A. Furthermore, we will make use of the condition numbein k (A) of 


a matrix A defined as k 


^ma.x (-^) 

^min (-^) 


In ifTTIl it is shown that the condition number of the graph 


Laplacian is at most 


O 







where Wmax and Wmin represent the largest and the smallest edge weights in Q. Finally, the condition 
number of a sub-matrix of the Laplacian is at most O (n^ j, see ifTD . 


2.2 Standard Splittings & Approximations 

Our first contribution is a distributed version of the parallel solver for SDDM systems of equations 
previously proposed in HD- Before detailing our solver, however, we next introduce basic mathe¬ 
matical machinery needed for developing the parallel solver of HD . The parallel solver commences 
by considering the standard splitting of the symmetric matrix Mq : 

^Please note that in the case of the graph Laplacian, the condition number is defined as the ratio of the 
largest to the smallest nonzero eigenvalues. 
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Definition The standard splitting of a symmetric matrix Mq is; 


ATo = £>0 — ^0- 


Here, £)o is a diagonal matrix consisting of the diagonal elements in ATg such that; 

= [Mo],, Vt = l,2,...,n. 

Furthermore, Aq is a non-negative symmetric matrix such that; 




- [Mojy : if i j, 

0 ; otherwise. 


(3) 


To quantify the quality of the acquired solutions, we define two additional mathematical constructs. 
First, the Loewner ordering is defined as; 

Definition Let be the space of n x n-symmetric matrices. The Loewner ordering ^ is a partial 
order on 5(„) such that Y ^ AC if and only if AC — Y is positive semidefinite. 

Having defined the Loewner order, we next define the notion of approximation for matrices “«q,”; 

Definition Let AC and Y be positive semidefinite symmetric matrices. Then X Y if and only 
iff 

e-“AC diY ^ e“AC (4) 

with A < B meaning B — A 'k positive semidefinite. 

Based on the above definitions, the following lemma represents the basic characteristics of the 
operator; 

Lemma 1. 1 1771? Let AC, Y, Z and, Q be symmetric positive semi definite matrices. Then 


(1) 

IfX'r 

«Q, Y, then AC 

+ Z^^Y + Z, 


(2) 

IfX'. 

«Q, Y and Z R 

"a Q, then X 4- Z 

~a Y Q, 

(3) 

IfX'. 

«Q,i Y and Y 

'~^Ct2 th^fJ Ctx-\-Ci2 

(4) 

IfX, 

and Y are non 

singular and X « 

a Y, then AC ^ 

(5) 

IfX'r 

«Q, Y and V is a matrix, then 

XV V^YV. 


Since the parallel solver returns an approximation, Zg, to Mq ^ (see Section]^, the following lemma 
shows that “good” approximations to Mq"^ guarantee “good” approximate solutions to MqX = 6g. 

Lemma 2. Let Zq and x = Z^bo, then x is y/2’^(e'^ — 1) approximate solution to 

MqX^ bg. 


Proof. The proof can be found in the appendix. 


□ 


2.3 The Solver 


The parallel SDDM solver proposed in ifTTIl is a parallelized technique for solving the problem of 
Section 2.1 It makes use of inverse approximated chains (see Definition |2.3[ ) to determine x and 
can be split in two steps. In the first. Algorithmic a “crude” approximation, a:g, of x is returned. 
Xq is driven to the e-close solution, x, using Richardson Preconditioning in Algorithmic Before 
we proceed, we start with the following two Lemmas which enable the definition of inverse chain 
approximation. 


Lemma 3. ? |77]? If M = D — A is an SDDM matrix, with D being positive diagonal, and A 
denoting a non-negative symmetric matrix, then D — AD~^ A is also SDDM. 


4 




Lemma 4. 4771/ Let M = D — A be an SDDM matrix, where D is positive diagonal and A a 
symmetric matrix. Then 


{D-A)-^ 


D-^ + {I + D-^ A) {D - AD-^ A) ^ {I + AD~^) 


(5) 


Given the results in Lemmas [^and|^ we now can consider inverse approximated chains of 

Definition Let C = {TVfo, ■ ■ ■; Md\ be a collection of SDDM matrices such that Mi = Di — 
Ai, with Di a positive diagonal matrix, and Ai denoting a non-negative symmetric matrix. Then C 
is an inverse approximated chain if there exists positive real numbers cq, ei,..., such that; 

(1) Di — Ai ^i—1 — Ai—iD^j^^^yAi—i V7 = 1,..., d, 

(2) A A- 1 , and 

( 3 ) Dd Dd — Ad¬ 
it is shown in ini that an approximate inverse chain allows for “crude” solutions to the system of 
linear equations in Dq — Aq in time proportional to the number of non-zeros entries in the matrices 
in the inverse chain. Such a procedure is summarized in the following algorithm: 


Algorithm 1 ParallelRSolve (TWq, TVfi,..., Md, bp) 

1: Input; Inverse approximated chain, {TVLq, Mi, ..., Md], and bo being 
2: Output; The “crude” approximation, Xq, of x* 

3: for 7 = 1 to (7 do 

4: bi = {I + A,_i b*-i 

5: end for 

6 : Xd = D~^bd 

7: for 7 = (7 — 1 to 0 do 

8: Xi — ^ [Dj ^bi -f (J -f 

9: end for 
10: return xq 


On a high level. Algorithm 2.3 operates in two phases. In the first (i.e., lines 3-5) a forward loop 
(up-to the length of the inverse chain d) computes intermediate vectors bi as: 

b, = + (6) 


for 7 = These can then be used to compute the “crude” solution Xq using a “backward” 

loop (i.e., lines 7-9). Consequently, the crude solution is computed iteratively backwards as: 


at* = i [A + {I + D. ^Ai) Xi+i] , 

with Xd = D^^bd and bi as defined in Equation]^ The quality of the “crude” solution returned by 
the Algorithm is quantified in the following lemma; 

Lemma 5. 4771/ Let {Mq, Mi, ..., Md} be the inverse approximated chain and denote Zq be the 
operator defined by ParallelRSolve {Mq, Mi, ..., Md, bp), namely, Xq = ZobQ. Then 


Zq 


Mq^ 


(7) 


Having returned a “crude” solution to MqX = bo, the authors in ifTTl obtain arbitrary close solutions 
using the preconditioned Richardson iterative scheme. The first step in the exact solver is the usage 
of Algorithm |2.3| to obtain the “crude” solution x- This is then updated through the loop in lines 4- 
8 to obtain an e-close solution to at*, see Algorithm]^ Following the analysis in ifTTl . Lemma 
provides the iteration count needed by Algorithm|^to arrive at x: 

Lemma 6. 4771/ Let {TVTo, Mi... Md} be an inverse approximated chain such that < 

7 In 2. Then ParallelRSolve {Mq, Mi, ..., Md, bp, e) requires q iterations to arrive at an e close 
solution ofx* with: q = O (log -j). 
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Algorithm 2 ParallelESolve {Mq, Mi, ..., M^, bg, e) 

1: Input; Inverse approximated chain {-Mg, Mi, ..., M^}, bg, and e. 
2: Output: e close approximation, x, of x* 

3: Initialize: yg = 0; 

X = ParallelRSolve {Mg, Mi,..., Md, bg) (i.e., Algorithm[^ 

4: for fc = 1 to q do 
5 : = Moyk-i 

6: = ParallelRSolve (^Mq, Mi, ..., Md, 

1 (2) I 

7 : Vk = Vk-i - ul'+ X 

8: end for 
9: x = yq 
10: return x 


3 Distributed SDDM Solvers 


Having introduced the parallel solver, next we detail our first contribution by proposing a distributed 
solver for SDDM linear systems. In particular, we develop two versions. The first, requires full 
communication in the network, while the second restricts communication to the R-Hop neighbor¬ 
hood increasing its applicability. Not only our solver improves the computational complexity of 
distributed methods for system of equations represented by an SDDM matrix, but can also be ap¬ 
plied to a variety of fields including distributed Newton methods, computer vision, among others. 


To compute the solution of SDDM systems in a distributed fashion, we follow a similar strategy 
to that of mi with major differences. Our distributed solver requires two steps to arrive at an e- 
close approximation to x*. Similar to O, the first step adopts an inverse approximated chain to 
determine a “crude” solution to x*. The inverse chain proposed in CH can not be computed in 
a distributed fashion rendering its immediate application to our setting difficult. Hence, we par- 
ways with mi by proposing an inverse chain which can be computed in a distributed fashion. 


This chain, defined in Section 3.1.1 enables the distributed computation of both a crude and exact 
solution to MqX — bg. Interestingly, due to the involvement of matrices with eigenvalues less than 
1, the length, d, of our inverse chain is substantially shorter compared to that of HD, allowing for 
fast and efficient distributed solvers. Given the crude solution, the second step computes an e-close 
approximation to x*. This is achieved by proposing a distributed version of the Richardson pre¬ 
conditioning scheme. Definitely, this step is also similar in spirit to that in HD, but generalizes 
the aforementioned authors’ work into a distributed setting and allows for e-close approximation 
to X* for any arbitrary e > 0. Main results on the full communication version of the solver are 
summarized in the following theorem: 

Theorem 1. There exists a distributed algorithm, 


A {{[Mo]ki, ■. ■ [Mo]kn}, [^>o]fc, e), 

that computes e-close approximations to the solution of MqX = bg in O log k log time 
steps, with n the number of nodes in Q, k the condition number of Mq, and [Afg]fc. the row of 
Mq, as well as e G (O, |] representing the precision parameter 


The above distributed algorithms require no knowledge of the graph’s topology, but do require the 
information from all other nodes (i.e., full communication) for computing solutions to MqX = bg. 
In a variety of real-world applications (e.g., smart-grids, transportation) load, capacity, money and 
resource restrictions pose problems for such a requirement. Consequently, we extend the previous 
solvers to an R-Hop version in which communication is restricted to the R-Hop neighborhood be¬ 
tween nodes for some R> Again we follow a two-step strategy, where in the first we compute 
the crude solution and in the second an e-close approximation to Xq is determined using an “R-Hop 
restricted” Richardson pre-conditioner. These results are captured in the following theorem: 

Theorem 2. There is a decentralized algorithm, 

A{{[Mo]ki, ■. ■ [Mo]kn}, [^>o]fc, R, e), 
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that uses only R-Hop communication between the nodes and computes e-close solutions to MqX = 
bo in 

O + aRdmax'^ log 

time steps, with n being the number of nodes in Q, dmax denoting the maximal degree, k the condi- 

f 

tion number of Mq, and a = min < n, > representing the upper bound on the size of the 

R-hop neighborhood Vv G V, and e G (0, being the precision parameter 

The remainder of the section details the above distributed solvers and provides rigorous theoretical 
guarantees on the convergence and convergence rates of each of the algorithms. We start by describ¬ 
ing solvers requiring full network communication and then detail the R-Hop restricted versions. 


3.1 Full Communication Distribnted Solvers 

As mentioned previously, our strategy for a distributed implementation of the parallel solver in Id 
requires two steps. In the first a “crude” solution is returned, while in the second an e-close approx¬ 
imation (for any arbitrary e > 0) to tCg is computed. 


3.1.1 “Crude” Distributed SDDM Solvers 


The distributed crude solver, represented in Algorithm resembles similarities to the parallel one 
of mi with major differences. On a high level. Algorithm operates in two distributed phases. 
In the first, a forward loop computes intermediate b vectors which are then used to update the 
crude solution of MqX = bp- The crucial difference to mi, however, is the distributed nature of 
these computations. Precisely, the algorithm is responsible for determining the crude solution for 
each node Vk G V. Due to such distributed nature, the inverse approximated chain used in im is 
inapplicable to our setting. Therefore, the second crucial difference to the parallel SDDM solver 
is the introduction of a new chain which can be computed in a distributed fashion. Starting from 
Mq = Dq — Aq, our “crude” distributed solver makes use of the following collection as the inverse 
approximated chain: 

C = {Aq, Dq, Ai, Di, ..., Ad, Dd}, (8) 


where Dk — Dq, and Ak = Dq (Z)g ^Ap) , for k = {1, ... ,d} with d being the length of the 
inverse chain. Note that since the magnitude of the eigenvalues of Dq^Aq is strictly less than 1, 

(Dp^^Ap)^ tends to zero as k increases which reduces the length of the chain needed. This length 
is explicitly computed in Section 3.1.3 for attaining e-close approximations to x*. 

It is relatively easy to verify that C is an inverse approximated chain, since: 


(1) A - Ai Di-i - Ai_i with e* = 0 for i = 1,..., d, 

(2) Di A -1 with ei = 0 for i = 1,... ,d, and 

( 3 ) Dd Dd — Ad- 

Algorithm [^returns the k*^ component of the approximate solution vector, [a;p]fc. As inputs it re¬ 
quires the inverse chain of Equationj^ the component of bp, and the length of the inverse chain. 
Namely, each node, G V, receives the k*^ row of Mq, the value of bp (i.e., [bp]fe), and the 
length of the inverse approximated chain d and then operates in two parts. In the first (i.e., lines 1-8) a 
forward loop computes the component of b exploiting the distributed inverse chain, while in the 
second a backward loop (lines 9-17) is responsible for computing the k*^ component of the“crude” 
solution [a:p]fc which is then returned. Essentially, in both the forward and backward loops each of 
the b and x vectors are computed in a distributed fashion based on the relevant components of the 
matrices, explaining the usage of N. loops in Algorithm]^ 

Theoretical Guarantees of Algorithm Due to the modifications made to the original parallel 
solver, new theoretical analysis quantifying convergence and accuracy of the returned “crude” solu- 
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Algorithm 3 DistrRSolve(^ {[Mo]fci,..., [Mo]fc„} , [bo]k,d 


Part One: Computing [bj\k 

[bi]k = [bo]fc + Y.j:vjeNi(vk)i-^oDQ^]kj[bo]j 
for z = 2 to d do 

for j : Vj e N 2 i-i (t’fe) do 


6: end for 

7: [bi]k = [bi-i]fc + 

8: end for 


J kj 


E n 
r—1 


1\2' 


(^o£»o-^) 




kr - 


kj 


[k- 


{AqDq 


iJi 


J jr 


9: Part Two: Computing [xo\k 

10 : [Xd]k = 

11 : for z = d — 1 to 1 do 


12: for 

j : Vj e N2i 

(Vk) 

do 




13: 


kj 

_ [Do]jj 

- l^r=l [Do].,. 

{D,^Aof-\ 

kr 

{Df^AQr-\ 


14: end for 


15: 

16: 

17: 

18: 


[Xi]k - 2[Dof, 

end for 

= 2[Dl]kk 

return: [a;o]fc 


[^i+i]fc+i 


+ -T ■ 

' 2 ^iw 


2 Z^iiUjGNjilufc) 


(Do^Ao 


kj 


[Xi+l]j 


[xi]k 

2 


- V - 
2 






tion is needed. We show that DistrRSolve computes the component of the “crude” approximation 
of X* and provide time complexity analysis. These results are summarized in the following lemm^ 

Lemma 7. Let Mq = Dq — Aq be the standard splitting of Mq. Let Zq be the operator defined by 
DistrRSolve{\{[MQ]ki ,..., [Mo]fcn}, [bo]k,d) (i.e., Xq = Z'^bo). Then 






Moreover, Algorithm^requires O {dri^^ time steps. 

In words, Lemma|^states that Algorithm|^requires O [dof) to arrive at an Cd approximation to the 
real inverse where this approximation is quantified using Definition 






Zg can then be used to compute the crude solution as Xq = Zgb. Note that the accuracy of approx¬ 
imating Mq is limited to motivating the need for an “exact” distributed solver reducing the error 
to any e > 0. 


3.1.2 “Exact” Distributed SDDM Solvers 

Having introduced DistrRSolve, we are now ready to present a distributed version of Algorit hm 
which enables the computation of e close solutions for MqX = bg. Contrary to the work of 111 II . 
our algorithm is capable of acquiring solutions up to any arbitrary e > 0. Similar to DistrRSolve, 
each node Vk GV receives the row of Mq, [bo]fej d and a precision parameter e as inputs. Node 
Vk then computes the component of the e close approximation of x* by using DistrRSolve as a 
sub-routine and updates the solution iteratively as shown in lines 2-6 in Algorithmic 

Analysis of Algorithm |C Here, we again provide the theoretical analysis needed for quantifying 
the convergence and computational time of the exact algorithm for returning e-close approximation 
to X*. The following lemma shows that DistrESolve computes the component of the e-close 
approximation of x^: 

^For ease of presentation, we leave the proof of the lemma to the appendix. 
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Algorithm 4 DistrESolve ({[Mpjfci,..., [Mo]fc„}, [bo]k,d, e) _ 

1: Initialize: [yo]k = 0; [x]k = DistrRSolve {{[Mo]ki, ■■■, [Mo]kn}, [bo]fe, d) (i.e., Algorithm]^ 


2 : 

3: 

4: 

5: 

6 : 

7: 

8 : 


for t = 1 to g do 


.( 1 ) 


u 


( 2 ) 


— [Dt)]kk[yt-l]k 

= DistrRSolve({[Mo]fci,..., [Mo]fc„}, \u. 


( 1 ) 


,d,) 


[2/i]/c [2/t—l]fc 

end for 


,( 2 ) 


[xh 


[x]k = [yq]k 
return \x\i^ 


Lemma 8. Let Mq = Dq — Ag be the standard splitting. Further, let < 

iln2 in the nverse approximated chain C — {Ag, Dg, Ai, Di,..., A^,-D^}. Then 

DistrESolve{{[MQ\ki, ■ ■ ■, [Aig]fc„}, [fegjfc, d, e) requires O (log i) iterations to return the com¬ 
ponent of the e close approximation for x*. 

The above lemma proofs that the algorithm requires O (log ^) iterations for attaining for returning 
the of the e-close approximation to x*. Consequently, the overall complexity can be summarized 
as: 

Lemma 9. Let Mq = Do ~ -^o be the standard splitting. Further, let Cd < 

I in 2 in the inverse approximated chain C = {Ag, L)g, Ai, Di,..., A^,-Dd}- Then, 

DistrESolve{{[Mo]ki, ■ ■ ■, [Mo]kn}, d, e) requires O [dnf log(i)) time steps. 

3.1.3 Length of the Inverse Chain 

Both introduced algorithms depend on the length of the inverse approximated chain, d. Here, 
we provide an analysis to determine the value of d which guarantees Cd < ^ in 2 in C = 
{Ag, Do, Ai,Di,..., Ad, Dd}: 

Lemma 10. Let Mo = Do — Ag be the standard splitting and k denote the condition number of 
Mq. Consider the inverse approximated chain 

C = {^Oj Do, Ai,Di,..., Ad, Dd], 

with d = [log (^2 in then Do Do - Do {Df^Ao)^ , with Cd < \ in 2. 

Proof. The proof will be given as a collection of claims: 

Claim: Let k be the condition number of Mo = -Dg — Ag, and denote the eigenvalues of 

Dg^^ Ag. Then, |Ai| < 1 — for alH = 1,..., n 

Proof. See Appendix. □ 

Notice that if Ai represented an eigenvalue of D^^ Ao, then A*" is an eigenvalue of {Df^AoY for 
all r S N. Therefore, we have 

p{{Do^Aof) ^ (l-^) (9) 

Claim: Let M be an SDDM matrix and consider the splitting M = D — A, with D being non 
negative diagonal and A being symmetric non negative. Further, assume that the eigenvalues of 
A lie between —a and /3. Then, 

(1-/3)D^D-A^ (l-ha)D. 
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Proof. See Appendix. 


□ 


Combining the above results, gives 


1 - 1 - 


1 


Dd f Dd — Ad f 


1 + 1 - 


Dd. 


Hence, to guarantee that D, 


d 


Dd — Ad, the following system must be satisfied: 


< 1 - 1 - 


1 


Introducing 7 for (l — i) , we arrive at: 

Cd > In 


and > 1 + 1 — 


1 


1-7 


, and ed>ln(l + 7 ). 


Hence 


, ed > max |ln (j ln(l + 7 )| = In ■ Now, notice that if d = [logctt] then, 7 = 

(1-s)^ = (1-Hence, In < In This gives c = r21n (^)l, 

implying ea = In ( 5 !^) < | In 2 . 


□ 


Using the above results the time complexity of DistrESolve with d = [log ^2In ^ 
O (vf logKlog(i)) times steps, which concludes the proof of Theoremj^ 


3.2 R-Hop Distributed Solvers 

The above version of the distributed solver requires no knowledge of the graph’s topology, but 
does require the information from all other nodes. Next, we will outline an R-Hop version of the 
algorithm in which communication is restricted to the R-Hop neighborhood between nodes. Due to 
such communication constraints, the R-Hop solver is general enough to be applied in a variety of 
fields including but not limited to, network flow problems (see Section]^. Along with Theorem]^ 
the following corollary summarizes the results of the R-Hop distributed solver: 

Corollary 1. Let Mq be the weighted Laplacian of Q = (V,£, TU). There exists a decentralized 
algorithm that uses only R-hop communication between nodes and computes e close solutions of 

MqX = bo in O ^ 108 ( 7 )^ steps, with n being the number of nodes in Q, W,„ax, '^min 

denoting the largest and the smallest weights of edges in Q, respectively, a = min 

representing the upper bound on the size of the R-hop neighborhood Vv G V, and e G (0, being 
the precision parameter. 

Similar to development of the full communication solver, the R-Hop version also requires two steps 
to attain the e-close approximation to x*, i.e., the “crude R-Hop” and the “exact R-Hop” solutions. 


(rfH+l-l) 


3.2.1 “Crude” R-Hop Distributed Solver 

The “crude R-Hop” solver uses the same inverse approximated chain as that of the full communi¬ 
cation version (see Equation]^ to acquire a “crude” approximation for the component Xq while 
only requiring R-Hop communication between the nodes. Algorithm represents the “crude” R- 
Hop solver requiring the inverse chain, component of bo, length of the inverse chain d, and the 
communication bound R as inputs. Namely, each node Vk G V receives the row of Mq , 
component, [bo]fc, of bo, the length of the inverse chain, d, and the local communication bouncj^i? 
as inputs to output the component of the “crude” approximation to x*. Algorithm operates 
in three major parts. Due to the need of the R-powers of AqD^^ and DoAq^, the first step is to 
compute such matrices in a distributed manner. 

"'For simplicity, R is assumed to be in the order of powers of 2, i.e., R = 2'’. 
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Algorithm 5 RDistRSolve (C, [6o]fc) d, R) 

1: Part One: 


2: {[Co]fci,..., [C'o]fc„} = fo ([Mo]fei,.. 

, [Mo]kn, R) 

3: {[Ci]fci,...,[Ci]fc„}=fi([Mo]fei,.. 

, [Mo]kn, R) 


4 

5 

6 

7 

8 

9 

10 


Part Two: 
for z = 1 to d do 

k-l = 2’"7« 

[ut\ = [AoD-X_,], 
[ui:%=h{[ut\) 

[bt]k = [bi-i]k + 

end for 


11 

12 

13 

14 

15 

16 

17 

18 

19 

20 


Part Three: 

[xd]k = [M7[r»o]fcfc 

for z = d — 1 to 1 do 


h = ‘^'/r 

= [-Dq ^Aoa;i+i]fe 

end for 


[^o]k = h 

return [iCo 


[bpjfc 

[■Dojfcfc 

k 


+ [xi]k + [Dq ^AQXi\k 


Algorithm 6 fp {[Mo]ki, ■. ■, [Mo]kn, R) 
1: for Z = 1 to i? — 1 do 
2: for j s.l.Vj e Ni+i(vk) do 

3: [(A„D„-‘)'+‘],^= E 

r:VrGNi(vj) 


[Dolrr 

[^o]jj 


[{AqDq ^YjkrlAoD^ 


4: end for 

5: end for 
6: return cq 


{[(AoD^Ylku 


l(AoDY%n} 


Given the inverse chain and the communication bound, R, fg (•) and fi (•) serve this cause as detailed 
in Algorithms]^ andrespectively. Essentially, these algorithms execute multiplications needed for 

determining (ADq^J and (^DAq^) in a distributed fashion looping over the relevant hops of 
the network. For a node, Vk S V, the component of these powers are returned to Algorithm]^ 

, zz}; see Part One in 


as [Co]fci — [AqDq 
Algorithm]^ 


f and [CY = {DoAY" . for z = {1 


- ki 


R 


. ki 


Similar to the full communication version, the second two parts of the “crude R-Hop” solver run two 
loops. In the first, the component of bi is computed by looping forward through the inverse chain, 
while in the second the component of the crude solution is determined by looping backwards. 


The second part of the solver is better depicted in the flow diagrams of Figures |l(a)| and |l(b)| Within 
the first loop running through the length of the inverse chain, the condition z — 1 < p is checked. In 
case this condition is true, Ao, Dg, and the previous iteration vector are used to update bi as 
shown in Figure [T(a)| 


l^ilk - + 


,(*- 1 ) 


J k 


II 



































Algorithm 7 fi([Mo]fci,..., [Mo]fc„, R) 


1 

2 

3 

4 

5 

6 


for Z = 1 to i? — 1 do 

for j s.t.Vj e Ni+i{vk) do 

end for 
end for 

return ci = {[(Dq ^Ao)^]fei,..., [{Dq^A o)^]kn} 




Figure 1: The left figure depicts the first condition of part two in Algorithm which has to be 
checked. In case i — 1 < p, the computations shown in the figure are executed. The right figure 
handles the case when i — 1 > p. Here, the computations shown in the figure are executed. Note, 
that the chains are constructed using the computations returned by fo(-) and fi(-). 


This is performed using another loop constructing a series of [wj* vectors for j = 1,..., 2* 
used to update the component of bi at the iteration: At the next iteration, the condition 
i — 1 < p is checked again. In case this condition is met, the previous computations are executed 
again. Otherwise, the commands depicted in Figure|l(b)|run. Here, a temporary variable denoting 
the fraction to the communication bound R, /j_i = is used to determine the upper iterate 

bound in j. Again throughout this loop, a series of u vectors used to update [bi]k are constructed. 

Having terminated the forward loop, the component of the “crude” solution is computed by 
looping backward through the inverse approximated chain (see part three in Algorithm]^. For i 
running backwards to 1, an R-Hop condition, p, is checked. In case i < p, the following update is 
performed: 


(*+i)i 


[»7. 


for j = 2 ,..., 2®, i = c? — 1,..., 1, and 


\k = 






(i+i) 


J k 




The role of rj are backward intermediate solutions needed for updating the “crude” solution to 
Mqx = bo: 


{x^]k = ^ 


[h 


'i\k 


+ [xi]k + 


[Do\kk 

for a node Vk G V. In case i > p a similar set of computations are executed for updating the crude 
solution using: 


\xi]k — 2 


[b^] 




0\kk 


[Xl]k 


[Do^Ax,] 


Analysis of Algorithm [^Similar to the previous section, we next provide the theoretical analysis 
needed for quantifying the performance of the crude R-Hop solver. The following Lemma shows 
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that RDistRSolve computes the component of the “crude” approximation of x* and provides 
the algorithm’s time complexity; 

Lemma 11. Let Mq — Dq — Aq be the standard splitting and let Zq be the operator defined by 
RDistRSolve, namely, atg = then Furthermore, RDistRSolve requires 

f2’^ \ 

(D [ ^ o. -f (xRdjyidx 1 5 


where a = min 


to arrive at Xq. 


Proof. The proof of the above Lemma can be obtained by proving a collection of claims; 

Dq Aq) and yA^Df ) have sparsity patterns corresponding to the r-Hop 
neighborhood for any r £ N. 


The above claim is proved by induction on R. We start with the base case; for R = 1, 


[AqDq \ij — 


if j ;G NiK) 
0 otherwise. 


Therefore, AqD^^ has a sparsity pattern corresponding to the 1-Hop neighborhood. Assume that 
for all 1 < p < R— 1, [AqDq^Y has a sparsity pattern corresponding to the p— hop neighborhood. 
Consider, (^A^^Dq^Y 


\{A^D^Y%3 = Y^[{AoD^Y^-Yk[AoD^%, ( 10 ) 


Since AqDq ^ is non negative, then [(AoJTq YYtj ^ iff there exists k such that Vk G 
andrtfc £ Ni(t;j), namely, £ Nfl(rti). The proof can be done in a similar fashion for □ 


The next claim provides complexity guarantees for fo( ) and fi( ) described in Algorithms]^ and 
respectively. 

Claim; Algorithmsanduse only the R-hop information to compute the k*^ row of (D^^Aq)^ 


and (^AqDq , respectively, in O {aRd^ 


time steps, where a = min n, 


Proof. The proof will be given for fo( ) described in Algorithm|^as that for fi (•) can be performed 
similarly. Due to Claim [LTI] we have 


[AqDq 


1\ 


J kj 


IL 

^ [AoDf^]^^ = Y. [AoD, 


r—1 


r:VrGNi{vj 

+ 1 . 


J rj 


( 11 ) 


Therefore at iteration I + 1, Vk computes the k*^ row of (^ AqDq using; 


(1) the row of {AqDq and 

(2) the column of A^Df^. 

Node Vr, however, can only send the row of A^Df^ making AqD^^ non-symmetric. 
Noting that [AoDQfri/lDo]^^ = [^oO^^]jr-/[Do]jj, since Df^AoD~^ is symmetric, leads to 
[{AoDQ^y+'^]kj = X) ^^f^[iAoDy^y]kr[AoDQyjr. To prove the time complexity 

r:i;,.GNi(-U3) 

guarantee, at each iteration Vk computes at most a values, where a = min < n, *e 

upper bound on the size of the R-hop neighborhood Vv £ V. Each such computation requires at 
most 0{dmax) operations. Thus, the overall time complexity is given by 0{aRdmax)- n 
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We are now ready to provide the proof of Lemma [TT] 


Proof. From Parts Two and Three of Algorithm it is clear that node Vk computes 
[bi]fc, [b 2 ]k, ■ • •, [bd]k and [xd]k, [xd-i]k, • ■ •, [xo]k, respectively. These are determined using the 
inverse approximated chain as follows 


-1n2- 
0 ) 


Ji-1 


bi — {I + — bi_i + (AqDi 

Xi = 2 [-^i ^bi + (/ + = 2 [-^0 ^bi + aJi+i + [Dq ^Aq)^ a^i+i] 


( 12 ) 


Considering the computation of [6i]fc,..., [bd]k for p > i — 1, we have 

[bi]k = [bi-i]k + [{AoDq bi-i]k = [bi_i]fc + [AqDq ^. AqDq ^ 


= [bi-i]k + [AqD^ 


-K .. AoDf^ 


u\ 


2i-l 

\k ■ ■ ■ = \bi_i]k + u 


(i-1) 


with u 


= AqDq for j = 1,... 2* ^ — 1. Since AqDq ^ has a sparsity pattern corre¬ 


sponding to 1-hop neighborhood (see Claim 


3.2.1 1 , node computes 


u 


(i-i) 

'j+i 


, based on u] 


(z-l) 


acquired from its 1-Hop neighbors. It is easy to see that Vt such that i — 1 < p the computa¬ 
tion of [bi]k requires O {2^~^dmax) time steps. Thus, the computation of [bi]fe,..., [bp]k requires 
0(2^(in,ax) = 0{Rdmax)- Now, cousider the computation of [bi]k but for i — 1 > p 


[bi]k = [bi-i]k + [(^o-Dg 


bi-i]k = [bi-i]k + [Cq .^. CQ hi-i]k 

^i-l 


— \bi-i\k + \Co ■ ■ ■ C(d u\ ^]fc — \bi-i\k + 

“v* 

h-l-l 


‘'h-1 


- k 


with Cg = {AqDq li_i = and u 


= CgM^* for j = 1, ■ ■ • , k-i - 1- Since Cp 
has a sparsity pattern corresponding to R-hop neighborhood (see Claim 3.2. 1[ ), node Vk computes 
based on the components of u^j attained from its R-hop neighbors. For each i such 


mm < n 


(^max 1) 


that i — 1 > p the computing [bi]k requires O time steps, where a = 

being the upper bound on the number of nodes in the R— hop neighborhood V d G V. There¬ 
fore, the overall computation of [6p+i]fc, [bp+ 2 ]fcj ■ ■ • j [bd\k is achieved in O time steps. 

Finally, the time complexity for the computation of all of the values [bi]fc, [b 2 ]fc, • • ■, [bd]fc is 
Rdmax^- Similar analysis can be applied to determine the computational complexity 
of [xd\k, [tC d-ilfc, ■ ■ •, [xi]k, i-e.. Part Three of Algorithm]^ We arrive at Zq Finally, 


o(¥o. 


using Claim 


3.2.1 


the time complexity of RDistRSolve (Algorithm is O + aRdr, 


□ 


3.2.2 “Exact” R-Hop Distributed Solver 

Having developed an R-hop version which computes a “crude” approximation to the solution 
of MqX = bo, now we provide an exact R-hop solver presented in Algorithm Similar to 
RDistRSolve, each node Vk receives the row Mq, [bo]fc> d, R, and a precision parameter e 
as inputs, and outputs the component of the e close approximation of vector x*. 

Analysis of Algorithm]^ The following Lemma shows that EDistRSolve computes the com¬ 
ponent of the e close approximation to x* and provides the time complexity analysis. 

Lemma 12. Let Mq = Dq — Aq be the standard splitting. Further, let td < ^/s In 2. Then Algo- 
rithm^requires O (log i) iterations to return the component of the e close approximation to 

X*. 
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Algorithm SEDistRSolve ({[Mpjfci,..., [Mo]fc„}, [bo]k, d, R, e) _ 

Initialize; [yo]k = 0, and [x]k = RDistRSolve({[Mo]fei,..., [Mo]kn}, [bo]k,d,R) 
for t = 1 to g do 

[iJ-t ^]/c = [Do\kk[yt-l]k ~ 

= RDistRSolve({[Mo]fci,..., [Mo]kn}, [u[^'^]k,d,R) 

[yt]k = [yt-i]k - + [x]k 

end for 

return [x]k = [yq]k 


The following Lemma provides the time complexity analysis of EDistRSolve; 

Lemma 13. Let Mq = Do ~ -^o be the standard splitting and let Cd < In 2, then EDistRSolve 
requires O {{'^‘^/Ra + ctRdmax) log (^/e)) time steps. Moreover, for each node Vk, EDistRSolve only 
uses information from the R-hop neighbors. 


Length of the Inverse Chain: Again these introduced algorithms depend on the length of the inverse 
approximated chain, d. The analysis in Section 3.1.3 can be applied again to determine the d = 


riog(21n(^)/c)l 


as the length of the inverse chain. 


3.3 Comparison to Existing Literature 

As mentioned before, the proposed solver is a distributed version of the parallel SDDM solver 
of im. Our approach is capable of acquiring e-close solutions for arbitrary e > 0 in 

O {n^ ^ log ( 7 ) ^, with n the number of nodes in graph Q, and Winin denoting the largest 
and smaller weights of the edges in Q, respectively, a. = min < n, > representing the upper 

I f^max -1- J 

bound on the size of the R-Hop neighborhood Vu S V, and e € (0, j] as the precision parameter. 
After developing the full communication version, we proposed a generalization to the R-Hop case 
where communication is restricted. 

Our method is faster than state-of-the-art methods for iteratively solving linear systems. Typical 
linear methods, such as Jacobi iteration m, are guaranteed to converge if the matrix is strictly 
diagonally dominant. We proposed a distributed algorithm that generalizes this setting, where it is 
guaranteed to converge in the SDD/SDDM scenario. Furthermore, the time complexity of linear 
techniques is logn), hence, a case of strictly diagonally dominant matrix Mo can be easily 

constructed to lead to a complexity of 0{n^ log n). Consequently, our approach not only generalizes 
the assumptions made by linear methods, but is also faster by a factor of log n. Furthermore, such 
algorithms require average consensus to decentralize vector norm computations. Contrary to these 
methods which lead to additional approximation errors to the real solution, our approach resolves 
these issues by eliminating the need for such a consensus framework. 

In centralized solvers, nonlinear methods (e.g., conjugate gradient descent 0211361, etc.) typically 
offer computational advantages over linear methods (e.g., Jacobi Iteration) for iteratively solving 
linear systems. These techniques, however, can not be easily decentralized. For instance, the stop¬ 
ping criteria for nonlinear methods require the computation of weighted norms of residuals (e.g., 
||Pfe||Mo "'ith pk being the search direction at iteration k). To the best of our knowledge, the dis¬ 
tributed computation of weighted norms is difficult. Namely using the approach in 13^ . this requires 
the calculation of the top singular value of Mo which amounts to a power iteration on M^Mo lead¬ 
ing to the loss of sparsity. Furthermore, conjugate gradient methods require global computations of 
inner products. 

Another existing method which we compare our results to is the recent work of the authors Il34ll 
where a local and asynchronous solution for solving systems of linear equations is considered. In 
their work, the authors derive a complexity bound, for one component of the solution vector, of 

O ^min ^^, with e being the precision parameter, d a constant bound on the 

maximal degree of Q, and G is defined as a: = Gx -f z which can be directly mapped to Ax = b. 
The relevant scenario to our work is when A is PSD and G is symmetric. Here, the bound on 
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the number of multiplications is given by O ^min (^d ^ with k{A) be¬ 

ing the condition number of A. In the general case, when the degree depends on the number of 
nodes (i.e., d = d(n)), the minimum in the above bound will be the result of the second term ( 
i) leading to O (d{n)nK{A) In i). Consequently, in such a general setting, our ap¬ 
proach outperforms 041 by a factor of d{n). 

Special Cases: To better understand the complexity of the proposed SDDM solvers, next we detail 
the complexity for three specific graph structures. Before deriving these special cases, however, we 
first note the following simple yet useful connection between weighted and unweighted Laplacians 
of a graph Q. Denoting by B the incidence matrix of Q and W a diagonal matrix with edge weights 
as diagonal elements, we can write; 

Cg = B^B and £^"'“g>'ted) ^ b^WB. 

Hence, we can easily establish: 

Mn {Cg) and (^Iweigh.ed)^ > 


This implies that the condition number of the weighted Laplacian satisfies: 


\L 


(weighted) \ 






(weighted) \ 


^2 


^ '^max / £. \ 

< - K{Cg) , 

ICmin 


with rUmin ^ud lUmax ^Te the minimal and maximal edge weights of Q. Using the above, we now 
consider four different graph topologies; 


3.3.1 Path Graph 

Similar to the hitting time of a Markov chain on ^ath graph which is given by 0{n^), the time 
complexity of the R-Hop SDDM solver is given b}|j 

Corollary 2. Given a path graph Vn with n nodes, the time complexity of the R-Hop SDDM solver 
is given by: 

for any e > 0 and for k = m = 0{yGi). 


TsoDuiVn) = O ( n^log 


3.3.2 Grid Graph 


Recognizing that a grid graph Gkxm can be represented as a product of two path graphs, Gkxm = 
Vk X Vm, our solver’s computational time can be summarized by: 


Corollary 3. Given a grid graph, Gkxm, the time complexity of the distributed SDDM-solver can 
be bounded by: 


Tsddm (Gkxm) = O ( nlog - 


for any e > 0. 


3.3.3 Scale-Free Networks (Polya-Urn Graphs) 

Using the results developed in ll^HOl the total time complexity of the distributed R-Hop solver is 
bounded by: 

Corollary 4. Given a scale-free network, GsN{n), the time complexity of the R-Hop SDD solver for 
R=1 is given by: 

Tsdd = O log nlog . 

^Due to space constraints, the proofs can be found in the appendix. 
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3.3.4 d- Regular Ramanujan Expanders 

For d regular Ramanujan expanders in which d does not depend on n, we have (-CRExp((i)) < 2d 

and fj .2 (>CRExp(d)) > d — 2\/d — 1. Hence, the time complexity of the SDD-solver is given by 
constant time. 

The developed R-Hop distributed SDDM solver is a fundamental contribution with wide ranging 
applicability. Next, we develop one such application from. We apply our solver for proposing an 
efficient and accurate distributed Newton method for network flow optimization. Namely, the dis¬ 
tributed SDDM solver is used for computing the Newton direction in a distributed fashion up-to any 
arbitrary e > 0. This results in a novel distributed Newton method outperforming state-of-the-art 
techniques in both computational complexity and accuracy. 


4 Distributed Newton Method for Network Flow Optimization 

Conventional methods for distributed network optimization are based on sub-gradient descent in 
either the primal or dual domains, see. For a large class of problems, these techniques yield iterations 
that can be implemented in a distributed fashion using only local information. Their applicability, 
however, is limited by increasingly slow convergence rates. Second order Newton methods Q are 
known to overcome this limitation leading to improved convergence rates. 

Unfortunately, computing exact Newton directions based only on local information is challenging. 
Specifically, to determine the Newton direction, the inverse of the dual Hessian is needed. Determin¬ 
ing this inverse, however, requires global information. Consequently, authors in ||5] 0 ID proposed 
approximate algorithms for determining these Newton iterates in a distributed fashion. Accelerated 
Dual Descent (ADD) Q, for instance, exploits the fact that the dual Hessian is the weighted Lapla- 
cian of the network and performs a truncated Neumann expansion of the inverse to determine a local 
approximate to the exact direction. ADD allows for a tradeoff between accurate Hessian approxi¬ 
mations and communication costs through the N-Hop design, where increased N allows for more 
accurate inverse approximations arriving at increased cost, and lower values of N reduce accuracy 
but improve computational times. Though successful, the effectiveness of these approaches highly 
depend on the accuracy of the truncated Hessian inverse which is used to approximate the Newton 
direction. As shown later, the approximated iterate can resemble high variation to the real Newton 
direction, decreasing the applicability of these techniques. 

Contributions: Exploiting the sparsity pattern of the dual Hessian, here we tackle the above problem 
and propose a Newton method for network optimization that is both faster and more accurate. Using 
the above developed solvers for SDDM linear equations, we approximate the Newton direction up- 
to any arbitrary precision e > 0. This leads to a distributed second-order method which performs 
almost identically the exact Newton method. Contrary to current distributed Newton methods, our 
algorithm is the first which is capable of attaining an e-close approximation to the Newton direction 
up to any arbitrary e > 0. We analyze the properties of the proposed algorithm and show that, 
similar to conventional Newton methods, superlinear convergence within a neighborhood of the 
optimal value is attained. 

We finally demonstrate the effectiveness of the approach in a set of experiments on randomly gener¬ 
ated and Barbell networks. Namely, we show that our method is capable of significantly outperform¬ 
ing state-of-the-art methods in both the convergence speeds and in the accuracy of approximating 
the Newton direction. 

4.1 Network Flow Optimization 

We consider a network represented by a directed graph Q = {Af, S) with node set N} 

and edge set f = {!,..., E}. The flow vector is denoted by a; = with representing 

the flow on edge e. The flow conservation conditions at nodes can be compactly represented as 

Ax — b, 

where A is the N x E node-edge incidence matrix of Q defined as 
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1 if edge j leaves node i 
— 1 if edge j enters node i 
0 otherwise, 



and the vector b € I-*- denotes the external source, i.e., 5*-®^ > 0 (or 6*^®) < 0) indicates 6^®^ units of 
external flow enters (or leaves) node i. A cost function $e : K —K is associated with each edge 
e. Namely, denotes the cost on edge e as a function of the edge flow We assume that 

the cost functions are strictly convex and twice differentiable. Consequently, the minimum cost 
network optimization problem can be written as 

E 

min (13) 

e=l 

s.t. Ax = b 

Our goal is to investigate Newton type methods for solving the problem in[^in a distributed fashion. 
Before diving into these details, however, we next present basic ingredients needed for the remainder 
of the paper. 

4.2 Dual Subgradient Method 

The dual subgradient method optimizes the problem in Equation [T^by descending in the dual do¬ 
main. The Lagrangian, ((•) : x M, is given by 

E 

l{x, A) = - ^ + \^{Ax - b). 

e=l 


The dual function q{\) is then derived as 


qW 


inf l(x,\)= inf 

aiSR® xeRE 


e=l 


-X^b 


y inf - A^b. 


Hence, it can be clearly seen that the evaluation of the dual function q{\) decomposes into E one¬ 
dimensional optimization problems. We assume that each of these optimization problems have an 
optimal solution, which is unique by the strict convexity of the functions $e. Denoting the solutions 
by x*-®! (A) and using the first order optimality conditions, it can be seen that for each edge, e, x^®^ (A) 
is given bjj^ 

x(^)(A) = [<be]-'(A«-A(^)), (14) 

where i G Af and j G N denote the source and destining nodes of edge e = respectively 

(see 0 for details). Therefore, for an edge e, the evaluation of x^®^ (A) can be performed based on 
local information about the edge’s cost function and the dual variables of the incident nodes, i and 

j- 

The dual problem is defined as min^gRW 9(A). Since the dual function is convex, the optimization 
problem can be solved using gradient descent according to 


Afe+i = Afc - UkOk for all fc > 0, (15) 

with k being the iteration index, and gk = 9 (Afe) = Vg(Afe) denoting the gradient of the dual 
function evaluated at A = A^. Importantly, the computation of the gradient can be performed as 
Qk = Ax (Afc) — b, with x{Xk) being a vector composed of xl®^ (Afc) as determined by Equation[l4| 

®Note that if the dual is not continuously differentiable, the a generalized Hessian can be used. 


18 



Further, due to the sparsity pattern of the incidence matrix A, the element, of the gradient 
Qk can be computed as 

E E (16) 

e=(ij) e={j,i) 


Clearly, the algorithm in Equation[^can be implemented in a distributed fashion, where each node, 
i, maintains information about its dual, and primal, x^'^'UAk), iterates of the outgoing edges 
e = (i, j). Gradient components can then be evaluated as per 16 using only local information. Dual 
variables can then be updated using 15 Given the updated dual variables, the primal variables can 


be computed using 14 


Although the distributed implementation avoids the cost and fragility of collecting all information 
at centralized location, practical applicability of gradient descent is hindered by slow convergence 
rates. This motivates the consideration of Newton methods discussed next. 


4.3 Newton’s Method for Dual Descent 

Newton’s method is a descent algorithm along a scaled version of the gradient. Its iterates are typi¬ 
cally given by 

Afc+i = Afe + afcdfc forallfc>0, (17) 

with dk being the Newton direction at iteration k, and Uk denoting the step size. The Newton direc¬ 
tion satisfies 

Hkdk = -Qk, (18) 

with Hk = H (Afc) = V^q{\k) being the Hessian of the dual function at the current iteration k. 


4.3.1 Properties of the Dual and Assumptions 


Here, we detail some assumptions needed by our approach. We also derive essential Lemmas quan¬ 
tifying properties of the dual Hessian. 

Assumption 1. The graph, Q, is connected, non-bipartite and has algebraic connectivity lower 
bound by a constant uj. 

Assumption 2. The cost functions, ^e{’)> iw Equation\l 3\are 


1. twice continuously differentiable satisfying 

1 < ^e(-) < r, 

with 7 and T are constants; and 

2. Lipschitz Hessian invertible for all edges e S £ 

1 1 


^e{x) 


< 5\x — x\. 


The following two lemmas l5]|6l quantify essential properties of the dual Hessian which we exploit 
through our algorithm to determine the approximate Newton direction. 

Lemma 14. The dual objective q{X) = A^(Aa:(A) — b) — abides by the following 

two properties mm: 


1. The dual Hessian, H{\), is a weighted Laplacian ofQ: 

H{\) = V2g(A) = A [vV(a^(A))]'' 

2. The dual Hessian H (A) is Lispshitz continuous with respect to the Laplacian norm (i.e., \ \ ■ 
II/;) where C is the unweighted laplacian satisfying C = AA^ with A being the incidence 
matrix ofQ. Namely, VA, A.' 

||iT(A)-iT(A)||/; <i?||A-A||/;, 

with B = where piniC) and pi 2 {Cf denote the largest and second smallest eigen- 

'r\/k'2{C) 

values of the Laplacian C. 
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Proof. See Appendix. 


□ 


The following lemma follows from the above and is needed in the analysis later: 

Lemma 15. If the dual Hessian H(X) is Lipschitz continuous with respect to the Laplacian norm 
I, then for any A and A we have 

||Vg(A) - Vg(A) - H{\){\ - A)||£ < |||A - A|||. 

Proof See Appendix. □ 

As detailed in ia, the exact computation of the inverse of the Hessian needed for determining the 
Newton direction can not be attained exactly in a distributed fashion. Authors in fSlIS proposed ap¬ 
proximation techniques for computing this direction. The effectiveness of these algorithms, however, 
highly depends on the accuracy of such an approximation. In this work, we propose a distributed 
approximator for the Newton direction capable of acquiring e-close solutions for any arbitrary e. Our 
results show that this new algorithm is capable of significantly surpassing others in literature where 
its performance accurately traces that of the standard centralized Newton approach. 


\c (i.e.. Lemma 


14 


4.4 Accurate Distributed Newton Methods 


Using the results of the distributed R-Hop solver, we propose a novel technique requiring only R- 
Hop communication for the distributed approximation of the Newton direction. Given the results 
of Lemma 14 we can determine the approximate Newton direction by solving a system of linear 
equations represented by an SDD matritinwith Mq = Hk = H (A^). 

Formally, we consider the following iteration scheme: 






(19) 


with k representing the iteration number, ak the step-size, and dj. denoti ng the approximate Newton 


direction. We determine dk by solving H^dk = —Qk using Algorithm 
our approximation of the Newton direction, dk, satisfies 


3.2.2 


It is easy to see that 


|dfe - dfcllfffe < e||dfc||fjfe with dk = -Zkgk, 


3.2.2 


The accuracy of this ap- 


where Zk approximates hI according to the routine of Algorithm 
proximation is quantified in the following Lemma 

Lemma 16. Let Hk = H (A^) be the Hessian of the dual function, then for any arbitrary e > 0 we 
have 

e~'^ H\v < ZkV < e*^ H\v, Vv G I"*". 


Proof See Appendix. 


□ 


4.4.1 Convergence Guarantees 


Given such an accurate approximation, next we analyze the iteration scheme of our proposed method 
showing that similar to standard Newton methods, we achieve superlinear convergence within a 
neighborhood of the optimal value. We start by analyzing the change in the Laplacian norm of the 
gradient between two successive iterations 

Lemma 17. Consider the following iteration scheme A^+i = A^ + o-kdk with ak G (0,1], then, 
for any arbitrary e > 0, the Laplacian norm of the gradient, | l^fc+i 11£, follows: 


l|gfe-i-ilk < 


1 ~ Ct/c -f CtfeC 


PnjC) 

P2{C) 



QkWc 


alBT'^il + ef 

2nl{C) 


\\9k\\^ 




( 20 ) 


with p-ni^) <^nd P 2 {^) being the largest and second smallest eigenvalues of C, T and 7 denoting the 
upper and lower bounds on the dual’s Hessian, and B G M. is defined in Lemma [75] 


’For ease of presentation, we refrain some of the proofs to the appendix. 
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Proof. See Appendix. 


□ 


At this stage, we are ready to present the main results quantifying the convergence phases exhibited 
by our approach: 

Theorem 3. Let 7, F, B be the constants defined in Assuniption^and Lemma 14 /r„(£) and 
P2(>C) representing the largest and second smallest eigenvalues of the normalized laplacian C, 

e € ^0, precision parameter for the SDDM solver, and letting the optimal step-size 

parameter a* = ^ r (£) ) • proposed algorithm given by the + a*dk 


exhibits the following three phases of convergence: 

1. Strict Decreases Phase: While ||gfe||£ > m: 

1 

q{\k+i) - qi>^k) < -7 


7" 2 


2(l + e)2r2 pf^iC) 


Vi- 


2. Quadratic Decrease Phase: While < ||gfc||£ < Pi-' 

\\gk+i\\c < —WokWl- 
m 


3 . Terminal Phase: When ||gfc||£ < po-' 

llfl'fc+ilk < A l-a*+a*e 


\ 


PnjC.) 

P2[C) 


WdkWc, 


where po = and pi = with 


I = 


\ 


1 — a* + a*€ 


Pn{C) T 
P2{C) V 7 


with ^ = 


B{a*r{l + e)f 

2pl{C) 


( 21 ) 


Proof We will proof the above theorem by handling each of the cases separately. We start by con¬ 
sidering the case when ||gfe||£ > pi (i.e., Strict Decrease Phase). We have: 

9(Afe+i) = q{^k) + dki^k-vi ~ ^k) + 2 ~ ^k)~^ H {z){Xk+i — Afc) 

Q!^ 2 

= q{\k) -t- atgjdk -f -^^H{z)dk < q{\k) + aug^dk + -^dlCdk, 

z z7 

where the last steps holds since H{-) ^ 1^- Noticing that \\dk \\l < WdkWl (see Ap- 

pendix), the only remaining step needed is to evaluate g^dk- Knowing that dk = —ZkQk, we 
recognize 

gjdk = -gJZkgk < e~^^glHlgk (Lemma[T6]) 


< -- 


Pni^k) 


gkgk < - 


gn{C) 


T ^ 6^7 gl^gk e " 7 2 

gkgk < -TTTTA .. fr: = WTrFlWdkWc, 


Pri{C,) Pn{C) pI{C) 


where the last step follows from the fact that Vv G 


pn . ,,T 


: V V > 


Cv 


g(Afc+i) - g(Afe) < - 


( 7 m 2(£) y 


,r^(l + e)" 
2ppl{C) 


. Therefore, we can write 


\\gk\\ 


2 

£• 


It is easy to see that Uk = ot* = ^7e)2 r ^ (£) j minimizes the right-hand-side of the above 

equation. Using ||gfe||£ gives the constant decrement in the dual function between two successive 
iterations as 

.A A .A A/ 1 2 

q{ k+i) q{ k) - 2(l + e)2r2p^(^)'^^' 
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Considering the case when rjo < HgfcHlryi (i.e., Quadratic Decrease Phase), Equation!^ can be 
rewritten as 

\\9k+i\\c<e\\gk\\c + C\\gk\\h 

with 4 and ^ defined as in EquationEurther, noticing that since ||gfc||£ > go then H^fcllc < 
I Isa; I li = I Ififfc III- Consequently the quadratic decrease phase is finalized by 


l|g.+i||£ < C +1) lig.lll = 

Einally, we handle the case where | l^fcl |£ < Vo (i-C-, Terminal Phase). Since | l^fel || < ryoHSfellcj it 
is easy to see that 


IlSfc+lIk < (1^ + Ct7o)||fl'fe||£ = (1^ + 1(1 - l))||fl'fe||£ 

= 4ll9fc||z: = A l-a*+a*e 


\ 


Mn(>C) 


M2(-C) V 7 


gk\\c- 


□ 


4.4.2 Iteration Count and Message Complexity 

Having proved the three convergence phases of our algorithm, we next analyze the number of itera¬ 
tions needed by each phase. These results are summarized in the following lemma: 

Lemma 18. Consider the algorithm given by the following iteration protocol: = Aj. -f o*d}^. 

Let Aq be the initial value of the dual variable, and q* be the optimal value of the dual function. 
Then, the number of iterations needed by each of the three phases satisfy: 


1. The strict decrease phase requires the following number of iterations to achieve the 
quadratic phase: 

I —-1 -2 




1 - e 




'g2{C) y 7 

where Ci = Ci (e, 7, E, d, q{Xo), q*) = 2S'^{1 + e)^ [^(Ao) - g*] 

2. The quadratic decrease phase requires the following number of iterations to terminate: 


N 2 = l0g2 


5 log 2 ( 1 - a* (1 - e 


ti2(C) Y 7 


l0g2(?') 


where f = :^ \\gk' \ |c. with k' being the first iteration of the quadratic decrease phase. 

3. The radius of the terminal phase is characterized by: 


Pterminal 


< 


2 



k-2(C) 





gn{Cf)\J g2{Cf). 


Proof. See Appendix. 


□ 


Given the above result, the total message complexity can then be derived as: 


0{{Ni+ N 2 ) np ( K{Hk)- + Rdn 


log 
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4.4.3 Comparison to Existing Literature 


Recent progress towards distributed second-order methods applied to network flow optimization 
adopt an approximation scheme based on the properties of the Hessian. Most of these methods 
(e g-^ Gl) handle the simpler consensus setting and thus are not directly applicable to our more 
general setting. Closest to this work is that proposed in a, where the authors approximate the 
Newton direction by truncating the Neumann expansion of the pseudo-inverse of the Hessian. This 
truncation, however, introduces additional error to the computation of the Newton direction leading 
to inaccurate results especially on large networks. The primary contrast to this work is that our 
method is capable of acquiring e-close (for any arbitrary e) approximation to the Newton direction. In 
fact, the proposed method is almost identical to the exact Newton direction computed in a centralized 
manner (see Section |431 l. 

Next, we formally derive the iteration counts for our method on four-special cases as benchmark 
comparisons. Since the main contribution (due to the presence of loglog(-) term in N 2 ) in the iter¬ 
ation count is given by the strict decrease phase (see Lemma [T8|l, we develop these results in terms 
of Ni. Also note that the corollaries provided below are substantially harder to acquire when com¬ 
pared to the standard Newton analysis due to the dependence of some constants, e.g, m and M on the 
graph structure. Opposed to the analysis performed by other methods, our analysis explicitly handles 
such dependencies leading to more realistic and accurate mathematical insights. This explains the 
relatively high dependency on n in some cases. Note, that in such case where these relations (i.e., 
parameter dependency on the graph structure) were not taken into account, substantial decrease in 
the complexity can be achieved at the compensate of accurate mathematical description. 

Path Graph: 

Corollary 5. Given a path graph Vn with n nodes, the strict-decrease phase of the distributed 
Newton method is given by: 



Grid Graph: 

Corollary 6. For a grid graph Gkxm the strict-decrease phase of the distributed Newton method is 
given by: 



Scale-Free Graph: 

Corollary 7. For a scale free grid graph GsN(n) with n, the strict-decrease phase is given by: 



d-Regular Ramanujan Expanders: For such expanders the iteration count for the strict-decrease 
phase can be bounded by a constant. 

4.5 Experiments and Results 

We evaluated the proposed distributed second-order method in three sets of experiments on ran¬ 
domly generated and Barbell networks. The goal was to assess the performance on networks ex¬ 
hibiting good and bad mixing times. We compared our algorithm’s performance to: 1) exact-newton 
computed in a centralized fashion, 2) Accelerated Dual Descent (ADD) with two different split¬ 
tings 0161, 3) dual sub-gradients, and 4) the fully distributed algorithms for convex optimiza¬ 
tion im (FDA). An e of Yg-ggQ, a feasibility threshold of 10 and an R-Hop of 1 were provided 
to our SDDM solver for determining the approximate Newton direction. For all other methods free 
parameters were chosen as specified by the relevant papers. Feasibility and objective values were 
used as performance measures. 
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Iterations 

(a) 



Iterations 

(b) 




(c) (d) 


Figure 2: Performance metrics on two randomly generated networks, showing the primal objective, 
/ {xk), and feasibility ||Aa;fc — b|| as a function of the number of iterations fc on a loglog scale. 
On a relatively small network (i.e., 20 nodes and 60 edges- Figures (a) and (b)) we outperform all 
method by an order of magnitude and trace the trajectory of exact Newton computed in a centralized 
fashion. On larger networks (i.e., 50 nodes and 150 edges-Figures (c) and (d)), SDDM-Newton is 
superior to all other algorithms, where it converges 5 orders of magnitude faster. 


4.5.1 Experiments & Results on Random Graphs 

Two experiments on small (20 nodes, 60 edges) and large (50 nodes, 150 edges) random graphs were 
conducted. The random graphs were constructed in such a way that edges were drawn uniformly at 
random. The flow vectors, b, were chosen to place source and sink nodes diam((/) apart. 

Results summarizing the primal objective value, = exp([a;](®)) -|-exp(—and feasi¬ 

bility, 11 Aajfc — 6| |, on both networks are shown in Figure 3. On small networks (i.e., 20 nodes and 60 
edges), all algorithms perform relatively well. Clearly, our proposed approach (titled SDDM-Newton 
in the figures) outperforms, ADD, Sub-gradients, and the approach in BTIl with about an order of 
magnitude. Another interesting realization is that SDDM-Newton accurately tracks the exact New¬ 
ton method with its direction computed in a centralized fashion. The reason for such positive results, 
is that our algorithm is capable of approximating the Newton direction up-to-any arbitrary e > 0 
while abiding by the R-Hop constraint. For larger networks. Figures [2(c)| and [2(d)| SDDM-Newton 
is highly superior compared to other approaches. Here, our algorithm is capable of converging in 
about 3-5 orders of magnitude faster, i.e., we converge in about 3000 iterations as opposed to 6000 
for ADD which showed the best performance among the other techniques. It is worth noting that we 
are again capable of tracing exact Newton due to the accuracy of our approximation. 
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(b) 


Figure 4: Feasibility and objective results on a 60 node bar graph. Again in these sets of experiments, 
our method is capable of outperforming all other methods. Furthermore, SDDM-Newton is again 
capable of tracing the exact Newton method. 


4.5.2 Experiments & Results on Bar Bell Graphs 


In the second set of experiments we assessed the performance of SDDM-Newton on a barbell graph. 
Figure of 60 nodes. The graph consisted of two 20 node cliques connected by a 20 nodes line 
graph is depicted in. We assign directed edges on the graph arbitrary and solve the network flow 
optimization problem with a cost of = exp([a;](®^) + exp(—[at ](«)). 

Results in Figure |4(aj] and |4(b)| demonstrate the 
superiority of our algorithm to state-of-the-art 
methods. Here, again we are capable of con¬ 
verting faster than other techniques in about 
2 magnitudes faster. It is worth noting that 
the second-best performing algorithm is ADD 
which is capable of converging in almost 3000 
iterations. 



Figure 3: A high-level depiction of a barbell graph Finally, we repeated the same experiments on 
showing two cliques connected by a line graph. a larger bar bell network formed of 120 nodes 

(two cliques with 40 nodes connected by a 40 
node line graph). These results, demonstrated 
in Figures |5(a)| and |5(b)| again validate the previous performance measures showing even better 
performance in both the objective value and feasibility. 


4.5.3 Measuring Message Complexity 

Though successful, the experiments performed in the previous section show accuracy improvements 
without demonstrating per-iteration message complexities needed. To have a fair comparison to 
state-of-the-art methods, in this section we report such results on four different network topologies. 
Here, we show that our method requires a relatively slight increase in message complexity to trace 
the exact Newton direction. 

These per-iteration values were determined by deriving bounds to each of the benchmark algorithms. 
For all methods except SDDM-Newton and that in Q, such complexity can be bounded by (!I(dmax) 
with dmax being the maximal degree. As for SDDM-Newton, the per-iteration complexity is upper- 
bounded by 0(/«(£p)djnax), with h{Cq) being the condition number of the graph Laplacian. For the 
algorithm in 0 the message complexity satisfies 0(K(£p)ciniax log n) with n being the total number 
of nodes in Q. Immediately, we recognize that our algorithm is faster by a factor of log n compared 
to that in 0. Compared to other techniques, however, our method is slower by a factor of K{Cg). 


25 


































(b) 


Figure 5: Feasibility and objective results on a 120 node bar graph. Again in these sets of experi¬ 
ments, our method is capable of outperforming all other methods. Furthermore, SDDM-Newton is 
again capable of tracing the exact Newton method. 


■ SDDM-Newton ■ ADD ■ FDA ■ Sub Gradients 

8 



Figure 6; Message complexity results comparing our 
method to state-of-the-art techniques on four different net¬ 
work topologies. Graphs 1 and 2 correspond to random net¬ 
works with 20 nodes, 60 edge and 100 nodes and 500 edge, 
respectively. Graphs 3 and 4 report message complexities 
needed for two barbell graphs. 


To better quantify such a difference, 
we perform four sets of experiments 
on two randomly generated and two 
barbell networks with varying size. 
The random networks consisted of 20 
nodes, 60 edges, and 120 nodes and 
500 edges, respectively. Moreover, 
the barbell graphs followed the same 
construction of the previous section 
with sizes varying from 60 to 120 
nodes. Comparison results showing 
the logarithm of the message com¬ 
plexity are reported in the bar graph 
of Figure These demonstrate that 
SDDM-Newton requires a slight in¬ 
crease in the message complexity to 
trace the exact Newton direction. 


5 Conclusion 


In this paper we proposed a dis¬ 
tributed solver for linear systems de¬ 
scribed by SDDM matrices. Our ap¬ 
proach distributes that in m by 
proposing the usage of an inverse approximated chain which can be computed in a distributed fash¬ 
ion. Precisely, two solvers were proposed. The first required full communication in the network, 
while the second restricts communication to the R-Hop neighborhood between the nodes. 

We applied our solver to network flow optimization. This resulted in an efficient and accurate dis¬ 
tributed second-order method capable of tracing exact Newton computed in a centralized fashion. 
We showed that similar to standard Newton, our methods are capable of achieving superlinear con¬ 
vergence in a neighborhood of the optimal solution. We extensively evaluated the proposed method 
on both randomly generated and barbell graphs. Results demonstrate that our method outperforms 
state-of-the-art techniques. 
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6 SDDM Solver Proofs 

In this appendix, we provide the complete and comprehensive proofs specific for the distributed 
SDDM solver. 

Lemma 19. Let Zq and x = Z^b^. Then x is — 1) approximate solution of 

MqX ^ bo- 


Proof Let x* G M" be the solution of MqX — bp, then 

II®* ~ ^IImo = (®* ~ x)^M q{x* — x) = {x*YM qX* + {xYM qX — 2{x*YMqX ( 22 ) 

Now, consider each term in p2]l separately; 


1. = bjMp ^M^Z^bo = bJZobo 

2 . {x*YMox* = b'^Mf^MoMf^bo = blMf^bo < e^Zobo 

3. x^JSL qX — b^ZoAL^Zf^bo f- e^b^Z^bQ 


Note that in the last step we used the fact that Zq «£ Mq ^ implies Mq 
can be rewritten as: 


Zq Therefore, (221 


\Mo 


< 2(e'' — l)bJ.Zob| 


\Mo 


\x — at] 

{x*YM 

< 2(e^ - l)e’^(x*)'MoX* = 2{e’' - l)e"||£c- 


(23) 


Combining (23 i with bJZobo = {x*YM qZqMox* < e^{x*Y 

11^* I I 2 ^ 1 \ \ T 7\/T 1 \ I I I I 2 


\Mq 


□ 


Lemma 20. Let Mq = Dq — Aq be the standard splitting of Mq. Let Zq be the operator defined 
by DistrRSolve{[{[MQ]ki ,..., [Mo]fe„}, [bp]*, d) (i.e., Xq = Z^bo). Then 

Z'o 

Moreover, Algorithm^requires O (dnY time steps. 


Proof The proof commences by showing that (^Df^AQY and (^AqDY) ^ have a sparsity pattern 
corresponding to the r-hop neighborhood for any r G N. This case be shown using induction as 
follows 


1. If r = 1, we have 

Moji 

[Do]i 

0 otherwise . 

Therefore, AqDq^ has sparsity pattern corresponding to the 1-Hop neighborhood. 




2. Assume that {AqDq has a sparsity patter corresponding to the p-hop neighborhood for 
all p : 1 < p < r — 1. 
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3. Now, consider {AqDq ^y, where 

n 

[(Aor>o-')1., = Y.[{AoD^y^-%yAoD^%, (24) 

k^l 

Since AqDq^ is non negative then it is easy to see that [{AoDQ^y]ij 7 ^ 0 if and only if 
there exists k such that Vk S Nr.-i(i’i) and Vk S Ni(rtj) (i.e., Vj G Nr(iti)). 


For £)g ^Aq, the same results can be derived similarly. 

Please notice that in Part One of DistrRSolve algorithm node Vk computes (in a dis¬ 
tributed fashion) the components [bi]fc to [bd]k using the inverse approximated chain C = 
{^ 0 , -Do, Ai,Di, ...,Ad, Dd}. Formally, 


b, = 


{AqDq 


bi-i — bi-i -f (AqDq 


(^oDq 


bi-i 


2 j. — 2 

Clearly, at the iteration node Vk requires the row of (i.e., the row from 

21-2 

the previous iteration) in addition to the row of [AqDq^) from all nodes Vj G N 2 i-i (vk) 

2 ^ — 1 

to compute the row of (AqDq^) 


For computing 


(AoDo^y 


, node Vk requires the k*^ row and column of (^AqDq 

kj 



2 * — ^ 

The problem, however, is that node Vj can only send the row of which can be 

easily seen not to be see that symmetric. To overcome this issue, node Vk has to compute the 


2 ^ — ^ ^ 
column of based on its row. The fact that (^AoD^y is symmetric, 

manifests that for r = 1,... ,n 


[AoD^y 


J rj 




J jr 


[D, 


Ojrr 


Hence, for all r = 1,..., n 




\D, 


Ojrr 


J rj 






[Do]jj 


[AoD^y 


(25) 


J jr 


Now, lets analyze the time complexity of computing components [^ 2 ]^, ■ - ■, [bd]k- 


2 ^^ 

Time Complexity Analysis: At each iteration i, node Vk receives the row of (^AoD^y 
from all nodes Vj G N 2 i-i (ffc). using Equation]^ node Vk computes the corresponding columns 

as well as the product of these columns with the k*^ row of . Therefore, the time 

complexity at the iteration is O {v? +diam ((/)), where is responsible for the k*^ row 
computation, and diam(C/) represents the communication cost between the nodes. Using the fact 
that diam (G) < n, the total complexity of Part One in DistrRSolve algorithm is O (^dry). 

In Part Two, node Vk computes (in a distributed fashion) \xd-i\k: \xd- 2 ]k: • ■ ■, [:ro]fe using the 
same inverse approximated chain C = {Aq, Dq, Ai, Di ,..., A^, Dd]. 


Xi — 2^0 2 


^ A (Dq Aq) 

+ l{D^^Aof~\D,^Aof~'x,+, 


for z = d — 1,..., 1. Thus, 


= ^Dg ^bo + Y + ^ (Dg ^Ao) xi 


( 26 ) 
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Similar to the analysis of Part One of DistrRSolve algorithm the time complexity of Part Two as 
well as the time complexity of the whole algorithm is O (dn^). 

Finally, using Lemmafor the inverse approximated chain C = {Aq, Dq, -Di,..., D^i} 
yields: 


□ 

Lemma 21. Let Mq — Dq — Aq be the standard splitting. Further, let Cd < 
|ln2 in the nverse approximated chain C — {Aq,-Dq) Ai, Z3i,..., A^, Then 

DistrESolve{{[MQ\ki, • ■ •, [ALoJfcn}) [^o]fcj d, e) requires O (log i) iterations to return the com¬ 
ponent of the e close approximation for x*. 


Proof Notice that iterations in DistrESolve corresponds to Preconditioned Richardson Iteration: 

Vt = [I — Z'qMq] yt-i + Z^bo 

where Zq is the operator defined by DistrRSolve and yo = 0- Therefore, from Lemma|^ 

Finally, applying Lemma|^gives that DistrESolve algorithm needs O (log iterations to com¬ 
ponent of the e approximated solution for x*. □ 

Lemma 22. Let Mq = Dq — Aq be the standard splitting. Further, let Cd < 
I In 2 in the inverse approximated chain C — {Ag, Dg, Ai, Di,..., A^, Then, 

DistrESolve{{[MQ\ki, •.., [Aig]fc„}, [bgj^, d, e) requires O [dnf log(^)) time steps. 


Proof. Each iteration of DistrESolve algorithm calls DistRSolve routine, therefore, using the above 
the total time complexity of f DistrESolve algorithm is O (^dri^ 108 ( 7 )) time steps □ 

Lemma 23. Let Mq = Dq — Ag be the standard splitting. Further, let Cd < t /3 In 2. Then Algo¬ 
rithm 8 requires O (log -) iterations to return the component of the e close approximation to 

X*. 


Proof. Please note that the iterations of EDistRSolve correspond to a distributed version of the 
preconditioned Richardson iteration scheme 

Vt = [I — Z^Molut-i -F Zgbg 

with t/Q = 0 and Zg being the operator defined by RDistRSolve. Prom Lemma 3.8 it is clear that 
Zg Afg”^. Applying Lemma 2.12, provides that EDistRSolve requires O (log i/e) iterations to 
return the component of the e close approximation to x*. Pinally, since EDistRSolve uses pro¬ 
cedure RDistRSolve as a subroutine, it follows that for each node only communication between 
the R-hope neighbors is allowed. □ 

Lemma 24. Let Mq — Dq — Ag be the standard splitting and let Cd < In 2, then EDistRSolve 
requires O -F aRdmax) log (^/e)) time steps. Moreover, for each node v^, EDistRSolve only 

uses information from the R-Hop neighbors. 


Proof. Notice that at each iteration EDistRSolve calls RDistRSolve as a subroutine, therefore, for 
each node Vk only R-hop communication is allowed. Lemma 3.8 gives that the time complexity 

of each iteration is O {^oi -F aRdmax'^, and using Lemma 3.9 immediately gives that the time 
complexity of O (( 2 ‘^/fia -F aRdmax) log (^A)). □ 


31 


7 Distributed Newton Lemmas 

Lemma 25. The dual objective q{\) = A^(j4a;(A) — b) — $e(a;(A)) abides by the following 

two properties [6]: 

1. The dual Hessian, H{\), is a weighted Laplacian of Q: 

H(X) = W\{X) = A [vV(a=(A))] 

2. The dual Hessian H (A) is Lispshitz continuous with respect to the Laplacian norm (i.e., \ \ ■ 
ll^j where L is the unweighted laplacian satisfying C = AA^ with A being the incidence 
matrix of Q. Namely, VA, A; 

||Jf(A)-iL(A)||£ <B||A-A||£ 

with B = where and /X 2 (>C) denote the largest and second smallest eigen- 

'r^fi2(c) 

values of the Laplacian C. 


Proof For the first part see Lemma 1 in [ 6 ]. So, lets prove the second part: 

Lets denote FF(A) = [V^/(a;(A))]“^, then: 

H{X)-H{X) = A[WiX)-W{X)]A'^ (27) 

LFsing that ||S'||£ = sup^gi^ ^ lets fix some v ^ and consider the expression 
||A[W^(A)-FF(A)]ATr;||2: 

||A[FF(A)-LF(A)]A'rr;||^ = 
v'^A[W{X) - W{X)]A'^CA[W{X) - FF(A)]A'rr; 
v'^A[W{X) - FF(A)](A'^A)^[VF(A) - W(A)]A'^r; <' 
pI{C)v^A[W{X) - WiX)]^A'^v < 
pl{C)pl{\WiX) - WWDv-^AA-^v 

= pl{£)pli\WiX) - W{X)\)\\v\\l 


We used in step (^) C = AA^, and in step (^) we used that p,n{-^A) = p,n{AA^) 
Therefore, we have: 


||A[ty(A) - WiX)]A'^\\c < M„(/:K(|t^(A) - ty(A)l) 
Now, we upper bound the expression/i„(ITL(A) — TF(A)|): 

1 1 


TnilWiX) 


TF(A)I) < max 

e^S 


$e(a=e(A)) $e(a=e(A)) 


< 


(5max|a:e(A) — a;e(A)| 
eGS 


Pn{£)- 


(28) 

(29) 


In the last transition we used Assumption 2. Now, using formulae for the derivative of the inverse 
function we have: 


d 

dX, 


$]-^(A) 


1 ^ 1 
$([$]-i(A)) - 7 


(30) 


Hence, [4>] ^(A) is bounded, and therefore [$] ^(A) is Lipshittz continuous with constant L' = 2. 

Now, because £Ce(A) = [$]“^(Ai—A^), we have that £Ce(A) is Lipshitz continuous with correspond¬ 
ing constant L'. Hence, Ve G £: 


\xeiX) - a;e(A)| < -||A - A ||2 


(31) 


Now we are ready to prove the following 
Claim: For all e € and for any A, A: 


|a:e(A)-a:e(A)|<i^^^ 
7 s/p2i£) 


(32) 
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Proof. Consider three cases: 


1. A — A S In this case, using that Vu € : 


< ^l^indSlll: 


~ P‘2(.C) 


|a:,(A) - a:e(A)| < -||A - Aj^ < 

7 7 Vfi2{C) 


2. A — A G Span{l} In this case || A — Ajj^ = 0, and we have A = A+al, hence Xi — Xj = 
Xi + a — {Xj + a) = Xi — Xj, which gives: 


x^X) = [$]-i(A, - A,) = [$]-'(A, - A,) = x,(A) 


kl —1/ 


Therefore, 


|a;e(A) - ate(A)| = 0 = 


Consequently, is valid. 


'y \JP'2{C) 


3. A —_A =jui + U 2 , where tti G I-*-, G Span{l}. In this case ||A — A||£ = ||mi||£, 
and Xi — Xj = Xi — Xj + ui{i) — ui{j). Notice that the same expression for Xi — Xj will 
be in the case when A = A + Mi. Hence, using the first case which proves the claim: 


,(A) - a;e(A)| = |[$]-'(A, - A,) - [#]-i(A. - A,)| = 


|[$] (A*-Aj+ui(i)-ui(j))-[^] (Ai - Aj)| <-||mi ||2 <- 


1 ll«il 


1 IIA-A| 


"f \/7 \/ P2{^) 

□ 


Combining the above claim with (29 1 gives: 


/i„(|iy(A)-T¥(A)|)<-J'^ 


7 


(33) 


Using (33 1 in (281 leads us to: 


||ff(A) -If(A)||£ <i?||A-A||£, 


where B = 


IJ"n{C)S 


□ 


Lemma 26. If the dual Hessian H(X) is Lipschitz continuous with respect to the Laplacian norm 
II • ||£ (i.e.. Lemma 7), then for any X and X we have 

||Vg(A) - Vg(A) - H{X){X - A)||£ < |||A - A|||. 


Proof We apply the result of Fundamental Theorem of Calculus for the gradient Vg which implies 
for any vectors A and A in K" we can write 


Vg(A) = Vg(A) + / H{X + f(A - A))(A - A) dt, 


(34) 


We proceed by adding and subtracting H (A) (A — A) to the integral in the right hand side of (34 1 . 
It follows that 


Vg(A) = V(?(A) + 


(35) 


H{X + t{X - A)) - H{X) (A - A) + H{X){X - A) dt. 
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: can separate the integral in ([3^ into two integrals as 

Vg(A) = Vg(A) + [ \h{X + t{X - A)) - H{X)] (A - A) dt 


H{X){X - A) dt. 


( 36 ) 


The second integral in the right hand side of p6| ) does not depend on t and we can simplify the 
integral as H (A) (A — A). This simplification implies that we can rewrite (36 1 as 

Vq{X) = Vg(A) + H{X){X - A)+ (37) 

[ \H{X + t{X-X))-H{X)]{X-X)dt, 


By rearranging terms in ( [J7] ) and taking the norm of both sides we obtain 
||Vg(A) = Vg(A) - TT(A)(A - A)|U = 

r-l r 

H{X + t{X- X))- H{X) {X-X)dt 
H{X + t{X - A)) - H{X)] (A - A) 


10 


< 


c 

dt 


Now we are ready to prove the following 

Claim: Let H (A) be the Hessian of the dual function ^(A). Then, for any v G 

\\[HiX) - HiX)] t;||^ < \\HiX) - Tf(A)||^ ||t;||^ 


(38) 


(39) 


Proof. Consider three cases: 


1. v G 1-^. In this case (39i follows immediately from the definition: 

IklU ■ 


\c = 


sup^giJ 


2. V £ Span{l}. In this case ||rt||£ = 0 and_[iT(A) — H{X)]v = H{X)v — H{X)v = 
0 — 0 = 0 (because H{-)1 = 0). Hence, ( [^ is coiTect 

3. V = + M 2 , where ui G 1^,M2 G Span{l}. In this case ||rt||£ = ||mi||£, and 

||[jf(A)-fT(A)]r;||^= ||[jf(A)-ff(A)] {u^+u^)\\^ = 

||[jf(A) - TT(A)] «i||^ \\HiX) - T?(A)|U||mi||£ = 

||Tf(A)-/T(A)||£||t;||£ 

where in step (^) we used the first case result. 

This proves the claim 


□ 


Applying the above claim to ( p8] ) gives: 

||Vg(A) - Vg(A) - Tf(A)(A - X)\\c < 

f \\H{X + t{X - A)) - TT(A)||£f||A - X\\cdt 


/ B\\X - X\\ct\\X - X\\cdt = B\\X - X\\l tdt 
Jo Jo 

= -||A- A||2 
2 " 

where in step (^) we used the fact that ff( ) is Lipshitz continuous with respect to the laplacian norm 

II-Ik- □ 
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Lemma 27. Let 

have 


= H{\k) be the Hessian of the dual function, then for any arbitrary e > Owe 
e~'^ v~^H^v < v~^Zj^v < v~^H^v, \/v G I"*" 


Proof. Lets be the collection of eigenvalues of H^. and are corresponding eigen¬ 

vectors. Then 

n n ^ 

Hk=Y, Hl=Y, ( 40 ) 

i=2 i=2 Pi 

where we use = 0 and = 1. Now lets fix some <5 > 0 and consider the matrix Hk,s = 
Sr =2 P't'^i + i5ll^ = fffc + 511^. The corresponding linear system will have the form: 


^ k,6^k Qk 


and the operator Zk,s dehned by by EDistRSolve routine for (|4r|l satisfies: 

Vv G 


" ]v < v'^Zksv < e'^ v~''hAv, 


(41) 


(42) 


Notice that -^boldsymbolu^^^u^^^'^i + Hence, taking 


V G 1^ in (42 1 : 


rri 


V ' Hlv < V ' Zk,sv < e'^ v' H]^v, Vr; G I"*" 

The last step is to take the limit i5 —> 0 in (43 i and notice that Zk,s Zk'. 

hJ,v < ZkV < e’^ v~^ hJ,v, \/v G I"*" 


(43) 


□ 


Lemma 28. Consider the following iteration scheme A^+i = A*. -|- with G (0,1], then, 
for any arbitrary e > 0, the Laplacian norm of the gradient, | |gfc+i | Ic, follows: 


IlSfc-i-illt: < 


1 ~ Ctfe + Q:fc£ 


Pn{^) 

P2{C) 



\\gk\U 


iBT^l + ey 

2^2(£) 


'\\9k\\l 


(44) 


with pn (>C) ond p 2 (>C) being the largest and second smallest eigenvalues of C, T and 7 are constants 
from Assumption 2, and B G^ is defined previously. 


Proof. Because the dual function q{X) has Hessian which is Lipschitz continuous with respect to 
the laplacian norm || • ||l, we can write: 

lisfc+i — 9k — Hk{Xk+i — Afe)||£ < -^||Afc+i — Afclll (45) 


Using Afc+i = Afc + akdk can be rewritten as: 



ol^ B 

\\gk+i gk oikHkdkWc < 2 c 

(46) 

Therefore, 


(47) 

Since \ \yk H” ^k^kdkWc- Let — d]^ c^, then. 



\\ck\\Hk < epfcllfffe 

(48) 

and 

\\gk + akHkdklWc = \\gk + akHxidk + Ck)\\c = 

(49) 


\\gk - akgk + akHkCkWL < (1 - a:k)\\gk\\c + ak\\HkCk\\c 
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Therefore, we need to evaluate ||iTfcCfe||£: 

\\HkCk\\l = clHkLHkCk < iJ,n{L)clHlck 
< ^in{C)^j.niHk)clHkCk < = 

ry 


-elHlHtHlgi = 




7 


Hence, 


7 l4{L) 




\\HkCk\\c < e 


M 2 (^) V 7 


-Iklk 


Combining the above gives: 

llfffc + afc-fffcrffcllz: < 

Therefore, we have: 


1 ~ ctfe + ctfce 


^^n{C) T 

V 7 


\\gk\U 


0-2 B ~ 

llfffc+ill < Il5fe + «fci^fc4||£ + ^11411^ < 


1 ~ ctfc + ttfce 


Mn(^) /r 
M 2 (^) V 7 


.a^Br 2 (l + e)^ 2 

2nl{L) 


(50) 


(51) 


□ 

Lemma 29. Consider the algorithm given by the following iteration protocol: A^+i = A^+i + 
a* dk- Let Aq be the initial value of the dual variable, and q* be the optimal value of the dual 
function. Then, the number of iterations needed by each of the three phases satisfy: 

1. The strict decrease phase requires the following number iterations to achieve the quadratic 
phase: 


tU^) 


1 - e 


Tnjc) r 
M 2 (^) V 7 


where Ci = Ci (e, 7 , T, S, q{Xo), q*) = 26^(1 + e)^ [q{Xo) - q*] 

2. The quadratic decrease phase requires the following number of iterations to terminate: 


N 2 = l 0 g 2 


h log 2 ( 1 - a* ( 


l-a* 

’ ii2{c) y 7 


log 2 (r') 


where t" = :^\\gk' \ Iz:. with k' being the first iteration of the quadratic decrease phase. 
3. The radius of the terminal phase is characterized by: 


Pterminal f: 


1 - e 


k-n.(C) 

k-2(.C,) Y 7 


o — e^'rS 


-p,n{L)\/p,2{L). 


Proof. We will start with strict decrease phase. From Theorem 1: 


q{\k+i) - q{>^k) < - 


1 e 


- 2 €^ 


7^ Mi(i) 2 


2(l + e)2 r2 pi{L) 


Vi 


(52) 
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where rji = with: 


A = 


1 


a* + a*€ 


Mn {L'j 
^^2{L) 



B = 


S(a*r(l + e ))2 

2^2 (L) 


Hence: 


1 6 

q{\k+i) - q{Xk) < -7; 


- 2 e^ 


7^ , 

71 ^ 


.- 2 e^ 


2 (l + e)2r2;r4(L) 
73 Ati(i) (1-71)2 _ 


2(l + e)2 r2 ^^UL) 


132 

1 e-2^^ fi'iiL) (l--4)2 

2 (1 + e)2 r2 fJ,f^{L) f BiaT(l+e)r 


^ e-2^^ 7 " m!(L) (1-71)2 
(l + e)6r6 ^4(i) B^{a*r 


(l + e)6 re ^4(L) 


(1-71)2 



- 2 e 2 ^'(l + e )2 
- 2 e 2 '^"(l + e )2 


7^ Ati(i) S 2 


(1-71)2 = 


1 


7 ' 


k.AL)i y 

7-\/t‘2(i) / 


r(l-7l)2 = 


- 2 e 2 '^"(l + e )2 


r 2 ^2 (r) (1-71)2 
7^ ^ 2 ( 1 -) ^2 


(53) 


Now, notice that: 


1-A = 


1-A^ 
1 + 7l 


Moreover, because 0 < 7l < 1, therefore: 


\{l-A^f < {1-Af < (l-yl2)2 


(54) 
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Therefore, we can write: 


Denote 


^(Afe+i) — (?(Afc) < —2e^'^ (1 + e) 
-2e2'^"(l + e)2 


[l-Af 


< 


1^ ^^2{L) ^2 

rV^(£) 1 (i-^ 2)2 _ 

7^ i^2[L) ^2 4 


1 

~ 2 ® 




2 e^/i I ^ f„*\2 

(1 + ^) T7T7T7TT2(“ ) 


7^ M 2 (i) 


1 - e 


A^nW /r 
f^2{L) V 7 


_ 1 2 e"/, , N 2 r Vn(T) 1 / e ^ /7/r 2 ^Y 

2 ^ ^ S'M2(i)CM (l + e)HrM„(i)y' 


1 - e 


^^niL) /r^' 

M2(i^) V 7 

1 1 7 

'2CMi + e)2r2 ^2(2,) 

1 7 MiW 


1 - e 


/in(i) /r 


2 ^ 2 ( 1 + e) 2 r 2 ^ 2 ( 2 ,) 


1-e 


M 2 (i) y 7 

M2(i) V T 


(5* = 


1 


7 MiW 


1 - e 


f^n{L) r 

M2(i) V 7 


(55) 


(56) 


2 e(l + e)2r2 ^2(i) 

then from ( |55| l we have: 

q{Xk+l) - q{>^k) < -S* 

Hence, the number of iterations required by the algorithm for the strict decrease phase is upper- 
bounded by: 

{qiXo)-q*) 


Ni < 


27(l + 7l<,(Ao)-<,Y|ffl 


1 - e 


^^n{L) r 
M 2 (i) V 7 


where q* - optimal value of dual function. 


Now, lets analyze the quadratic decrease phase. We have, for po < ||pfc||L < Vi- 

lk+i||L<^lklli 

Hence, 

-Ik+ilk< f-lkiu) (57) 

t?i V7i / 

Now, denote I be the first iteration when quadratic phase is achieved, i.e ||gi||i, < pi, therefore, for 
k +1 iteration: 

— \\gk+i\\L < (—\\gk+i-i\\L) < • ■ ■ < llffiiu) = 
m \m } V7i ) 

where we use notation r — ■^\\gi\\L,i" & (Ojl)- Hence, the number or iterations ADD-SDDM 
algorithm requires to reach terminal phase is given by the following condition: 

< Vo 


38 































which immediately gives: 


k > log2 


l0g2 (t) 


’log2 7l' 

log2r 

— log2 

_ log 2 r 


Hence, 


N 2 = l0g2 


5 log2 ( 1 - a* (1 - e 


P-rijL) 

A‘2(i) V 7 


logar 


Finally lets consider the radius of the terminal phase: 


Pterminal — PO — 


A{l-A)^l- A 


a 


B 


< 


B 


1 _ 

B(aT(l+e))2 

27iJ(L) 


1 - e 


M 2 (i) 


7? 


-Pn{L)Vp2{L) 


□ 
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